Plant Phenomics
Elsevier BV
Preprints posted in the last 90 days, ranked by how well they match Plant Phenomics's content profile, based on 17 papers previously published here. The average preprint has a 0.02% match score for this journal, so anything above that is already an above-average fit.
Majid, M.; Tariq, H.; Mumtaz, I.; Kashif, M.
Image-based crop and pest recognition can reduce the delay and cost of manual field scouting, thereby supporting timely intervention in precision-agriculture workflows. However, real field imagery remains challenging because of cluttered backgrounds, occlusions, illumination changes, and strong scale variation across crops. Symptoms are often small or low-contrast, and pests may be partially hidden, which reduces reliability outside controlled environments. A unified multi-class crop-pest/condition recognition framework is presented in which a ResNet-50 backbone is enhanced with a Multi-Scale Contextual Attention (MSCA) module. Its main novelty is the integration of explicit multi-scale contextual aggregation with lightweight joint channel and spatial attention through residual fusion, evaluated under a fixed and reproducible protocol. A curated dataset of 21,404 field-style images covering 15 crop and pest/condition classes was compiled, and a leakage-aware fixed split with a held-out test set was adopted to support reproducibility. Augmentation was applied only to the training subset to improve robustness; the validation data were not augmented in the same manner. On the held-out test set, the proposed approach achieved balanced performance, with about 0.93 accuracy and a macro-F1 score close to 0.94, and outperformed established baselines such as EfficientNet, Vision Transformer, and attention-based CNN models under identical evaluation settings. Controlled ablations were used to isolate the contribution of MSCA and augmentation under the same training configuration. 
These results indicate that lightweight multi-scale contextual attention is effective for crop and pest recognition under realistic field conditions, although some visually similar classes remained difficult.
Chiwele, N.; Sweeney, E.; Hossain, K.
Plant disease detection using deep learning is essential for precision agriculture, enabling early and automated crop health monitoring. This study proposes an end-to-end transfer learning pipeline, LeafyVGG-16, for multi-class classification of plant diseases and nutrient deficiencies using a tomato leaf dataset. The framework integrates data preprocessing, augmentation, and a VGG-16 backbone with a two-stage fine-tuning strategy. The proposed model is evaluated against CNN, DenseNet-121, Inception-V3, EfficientNetB0, and ResNet-50, achieving an accuracy of 0.93 with precision, recall, and F1-scores of 0.93, 0.90, and 0.92, respectively. These results demonstrate the effectiveness of transfer learning for fine-grained plant disease recognition. We further evaluate model robustness under adversarial cyber attacks to assess deployment reliability in agricultural systems. Under Fast Gradient Sign Method (FGSM) attacks (ε = 0.01-0.05), the model shows an accuracy drop of 1%-7.5%, while Projected Gradient Descent (PGD) attacks (ε = 0.05, step size = 0.005, 10 iterations) produce similar degradation, highlighting the model's vulnerability to adversarial perturbations. These findings highlight potential security and reliability risks in AI-based agricultural decision-making systems. Future work will focus on improving robustness and cyber-resilience and extending this framework to other crops for secure and context-aware deployment in resource-constrained environments.
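As a rough illustration of the FGSM perturbation mentioned in this abstract, the sketch below applies one signed-gradient step to an input. It assumes a logistic-regression classifier as a stand-in model (the study itself uses a VGG-16 network), and the weights, input, and function names are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fgsm_perturb(x, y, w, b, epsilon):
    """One FGSM step against a logistic-regression stand-in (not the paper's model)."""
    p = sigmoid(x @ w + b)
    grad_x = (p - y) * w                   # gradient of the cross-entropy loss w.r.t. x
    x_adv = x + epsilon * np.sign(grad_x)  # move each feature by epsilon in the loss-increasing direction
    return np.clip(x_adv, 0.0, 1.0)        # keep values in the valid pixel range [0, 1]

rng = np.random.default_rng(0)
w = rng.normal(size=8)                     # hypothetical model weights
b = 0.0
x = rng.uniform(0.2, 0.8, size=8)          # hypothetical normalized input
y = 1.0                                    # true label

for eps in (0.01, 0.03, 0.05):             # epsilon range quoted in the abstract
    x_adv = fgsm_perturb(x, y, w, b, eps)
    print(eps, sigmoid(x @ w + b), sigmoid(x_adv @ w + b))
```

The predicted probability of the true class falls as ε grows, which is the mechanism behind the accuracy drop the abstract reports. PGD repeats a smaller step of this kind several times with projection back into the ε-ball.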
Loayza, H.; Ninanya, J.; Palacios, S.; Silva, L.; Pujaico Rivera, F.; Rinza, J.; Gastelo, M.; Aponte, M.; Kreuze, J. F.; Lindqvist-Kreuze, H.; Heider, B.; Kante, M.; Ramirez, D. A.
Potato (Solanum tuberosum L.) is a staple crop crucial to global food security, yet its production is severely threatened by late blight (LB), caused by Phytophthora infestans, one of the most destructive plant diseases worldwide. Breeding programs for LB resistance have traditionally relied on labor-intensive and subjective visual assessments, which limit scalability and consistency, particularly in early-generation trials. Unmanned aerial vehicle (UAV)-based remote sensing combined with machine learning (ML) offers a promising alternative for objective, high-throughput disease phenotyping. This study evaluated the potential of UAV-derived multispectral imagery and ML techniques to estimate LB severity across large and genetically diverse potato breeding populations, comprising 2,745 clones in one trial and 492 accessions in another, conducted in Oxapampa, Pasco, Peru. We compared vegetation index-based approaches with a machine learning framework that integrates K-means clustering and Kernel Ridge Regression (KRR) and assessed their ability to capture genotypic variation and support selection decisions. NDVI consistently showed a strong correlation with visually assessed LB severity, particularly at advanced stages of disease development, enabling objective discrimination between healthy and diseased canopy tissues. However, the KRR-based approach outperformed linear NDVI-based models by capturing nonlinear relationships between spectral responses and disease progression. Estimates of LB severity derived from NDVI and KRR models, expressed as best linear unbiased estimates (BLUEs), showed strong and biologically consistent relationships with the area under the disease progress curve (AUDPC), particularly during later UAV acquisitions. 
Selection coincidence between UAV-derived estimates and AUDPC-based rankings was substantially higher at intermediate to advanced stages of disease progression, suggesting that UAV assessments at these stages may capture sufficient phenotypic variation to distinguish genotypes. These findings indicate that UAV-based multispectral phenotyping, especially when integrated with ML, provides a practical and scalable approach for assessing LB severity in potato breeding programs while reducing the need for time-consuming field evaluations.
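Two quantities named in this abstract have standard closed forms: NDVI is the normalized difference of near-infrared and red reflectance, and AUDPC is usually computed with the trapezoidal rule over assessment dates. The sketch below shows both under those standard definitions; the reflectance and severity values are made up for illustration and are not from the study.

```python
import numpy as np

def ndvi(nir, red):
    """Normalized Difference Vegetation Index: (NIR - Red) / (NIR + Red)."""
    nir = np.asarray(nir, dtype=float)
    red = np.asarray(red, dtype=float)
    return (nir - red) / (nir + red)       # assumes NIR + Red is never zero

def audpc(days, severity):
    """Area under the disease progress curve via the trapezoidal rule."""
    days = np.asarray(days, dtype=float)
    severity = np.asarray(severity, dtype=float)
    return float(np.sum((severity[1:] + severity[:-1]) / 2.0 * np.diff(days)))

# Hypothetical healthy vs. diseased canopy pixels
print(ndvi([0.50, 0.30], [0.05, 0.20]))

# Hypothetical severity (%) scored at 0, 7, and 14 days
print(audpc([0, 7, 14], [0, 20, 60]))
```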
Tan, D.
Accurate quantification of leaf lesion severity is essential for plant disease research and phenotyping but is often limited by subjective visual scoring and time-intensive manual image analysis. We present LIME, a fully automated, open-source image analysis pipeline for high-throughput quantification of leaf lesions from disease assay images. LIME integrates zero-shot leaf segmentation using the Segment Anything Model with a convolutional neural network for lesion area estimation. Applied to Arabidopsis thaliana leaves infected with Sclerotinia sclerotiorum, the proposed approach achieved a mean absolute percentage error of 12.9%, comparable to observed intrarater variability in manual scoring. Stratified evaluation across lesion-size groups demonstrated consistent prediction accuracy for small, intermediate, and large lesions, and comparative analysis showed that the deep learning-based model substantially outperformed color-based baseline methods. Under GPU-accelerated execution, LIME processed complete assays containing approximately 200 leaves in 15 minutes, representing an approximate 13-fold reduction in processing time relative to manual annotation. Together, these results indicate that LIME enables objective, reproducible, and scalable quantification of leaf lesion severity in standardized plant pathology assays. The pipeline is released as an open-source tool to support quantitative phenotyping studies.
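For reference, the mean absolute percentage error quoted in this abstract can be computed as in the following minimal sketch. The function name and the lesion-area values are illustrative assumptions, not data from the study.

```python
import numpy as np

def mape(true_values, predicted_values):
    """Mean absolute percentage error, in percent."""
    t = np.asarray(true_values, dtype=float)
    p = np.asarray(predicted_values, dtype=float)
    return float(100.0 * np.mean(np.abs(t - p) / np.abs(t)))  # assumes no true value is zero

# Hypothetical manual vs. predicted lesion areas (mm^2)
manual = [10.0, 20.0, 40.0]
predicted = [9.0, 22.0, 38.0]
print(mape(manual, predicted))
```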
Schlichtermann, R.-H.; Warnemuende, S.; Tietgen, H.; Welna, G.; Stahl, A.; Wittkop, B.; Snowdon, R.
Though currently a minor crop, faba bean is a promising source of plant-based protein as global diets shift towards more plant-based nutrition. To realise this potential, advances in breeding and cultivation are crucial. To exploit heterosis, faba bean breeding frequently utilises synthetic cultivars, which involves open pollination of inbred lines to produce a mixture of F1 hybrid seeds and self-pollinated offspring. Pure F1 hybrid cultivars are currently unavailable due to unstable cytoplasmic male sterility (CMS) systems. An ability to distinguish F1 seeds from their parental inbreds via characteristics associated with xenia effects could change this. The xenia effect refers to the influence of paternal pollen on seed traits, for example seed weight and cotyledon cells in faba bean. In this study, we exploited the xenia effect captured in hyperspectral imaging data to develop machine learning scenarios for discriminating between parental and F1 seeds of open pollinated synthetic combinations (Syn-1). The hyperspectral data were pre-processed using Savitzky-Golay filtering to reduce noise and smooth the spectra. Various machine learning algorithms were applied, incorporating Bayesian hyperparameter optimisation. The scenarios achieved up to 98.9% accuracy in separating parental components of Syn-1. When including all seeds, the model achieved an F1 score of 40.7%, indicating moderate detection and classification performance. As the harmonic mean of precision and recall, the F1 score accounts for both the correctness of F1 seed detections and the completeness with which F1 seeds were detected. While this approach does not yet enable the development of full hybrid cultivars, it paves the way for hybrid-enriched cultivars. These could help to streamline breeding for synthetic cultivars and potentially increase yields, for example by increasing the proportion of F1 hybrid seeds in synthetic cultivars. 
This study extends knowledge of the xenia effect in faba bean and provides a basis for further research aimed at enhancing breeding methods and productivity.
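The F1 score described in this abstract is the harmonic mean of precision and recall. A minimal sketch follows; the counts are hypothetical and chosen only to show how a low recall pulls the score down.

```python
def f1_score(tp, fp, fn):
    """F1 score from true positives, false positives, and false negatives."""
    precision = tp / (tp + fp)   # share of predicted hybrid seeds that are truly hybrid
    recall = tp / (tp + fn)      # share of true hybrid seeds that were detected
    # Assumes tp > 0, so that precision + recall is nonzero
    return 2 * precision * recall / (precision + recall)

# Hypothetical counts: precision = 0.50, recall = 0.25
print(f1_score(tp=30, fp=30, fn=90))
```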
Prouvost, A.; Connesson, L.; Le Gourrierec, T.; Freville, H.; David, J.; Plessis, C.; Magnier, B.
Accurate and reproducible assessment of foliar disease severity is essential for evaluating the performance of heterogeneous plant communities and understanding host-pathogen interactions. However, traditional visual scoring methods remain subjective, offer limited precision, and are difficult to scale in large phenotyping experiments. Here, we present a semi-automated image analysis workflow designed to quantify multiple foliar disease symptoms simultaneously on wheat flag leaves sampled from varietal mixtures. The workflow combines three methodological components: (i) a standardized protocol for leaf sampling and imaging, (ii) supervised machine learning segmentation using Random Forest implemented in Ilastik to classify multiple symptoms (powdery mildew and yellow rust), and (iii) a graphical user interface facilitating pipeline deployment by non-specialist operators. To evaluate the influence of image representation on classification performance, four color spaces (RGB, HSV, HLS, LAB) were systematically compared. The approach was validated using images of durum wheat flag leaves collected from a field experiment assessing eight-way varietal mixtures under natural fungal pressure. Cross-validation against manually annotated images demonstrated high segmentation accuracy across all symptoms. Comparison among color spaces revealed only minor differences in performance. Overall, this workflow offers a cost-effective, annotation-efficient and reproducible alternative to deep learning approaches, leveraging open-source and actively maintained tools while requiring limited training data and enabling objective, reproducible and scalable disease phenotyping.
Crabb, G. U.; Cevik, V.; Chen, X.; Priest, N. K.; Zhao, Y.
Plant pathogens cause major yield losses worldwide, threatening food security and livelihoods. Because early infection is difficult to diagnose, management often relies on prophylactic pesticide use, increasing costs and environmental impact. Here we present PSNet, a multimodal framework that fuses hyperspectral imaging with RGB information for presymptomatic plant disease detection, together with a low-cost, portable hyperspectral camera incorporating a 3D-printed housing and optical mounts, costing under £500. We validate the approach using Arabidopsis thaliana infected with the oomycete Albugo candida. Imaging at 2 and 4 days post inoculation, prior to visible symptoms, revealed consistent spectral signatures that distinguished infected from healthy plants, while imaging at 6 days post inoculation captured the transition toward early symptom emergence. The most discriminative spectral regions overlapped wavelengths previously associated with plant responses to biotic stress, supporting the biological plausibility of these signatures. On a four-class task (healthy, 2 dpi, 4 dpi, 6 dpi), PSNet achieved 92.7% overall accuracy and 97.1% accuracy for binary healthy versus infected classification. Together, these results demonstrate that presymptomatic detection is feasible under controlled conditions using low-cost hardware and multimodal learning, underscoring the potential of scalable, multimodal systems for early disease monitoring.
Montesinos-Lopez, O. A.; Montesinos-Lopez, A.; Montesinos-Lopez, J. C.; Crossa, J.; Dreisigacker, S.; Hernandez-Suarez, C. M.; Ortiz, R.
Accurate modeling of genotype-by-environment (GxE) interaction is critical for genomic prediction in plant breeding but remains challenging due to complex interaction structures. Conventional models often use the Hadamard product of genotype and environment covariance matrices to capture joint similarity, which may not fully represent GxE complexity. Here we propose a novel framework that derives covariance structures from the matrix multiplication of genotype and environment kernels, decomposing these into symmetric components incorporated as random effects in mixed models. Evaluated across 11 wheat and rice multi-environment datasets, this approach consistently outperformed the traditional Hadamard-based model, improving prediction accuracy by up to 13.2% in Pearson's correlation and enhancing top-selection accuracy. Combining both methods yielded the highest performance, indicating complementary information capture. This framework offers a flexible, interpretable, and computationally feasible extension for modeling GxE interaction, potentially enhancing genomic selection effectiveness under diverse environmental conditions.
Saiz-Fernandez, I.; Bastidas Parrado, L. A.; Klimes, P.; Cavar Zeljkovic, S.; Ruiz de Galarreta, J. I.; Leyva-Perez, M. d. l. O.; Ortiz-Barredo, A.; Spichal, L.; De Diego, N.
The potato crop is highly vulnerable to abiotic stresses like salinity and low nutrient availability. Rapid identification of stress-resilient genotypes is therefore essential for breeding, yet conventional phenotyping is often slow, space-demanding and expensive. We present LOCOPOTS (a LOw-COst high-throughput screening platform for in vitro POTatoes under abiotic Stress), which combines individual in vitro plant culture, low-cost RGB imaging and machine-learning-based automatic segmentation using a trained convolutional neural network based on the U-Net architecture. LOCOPOTS enabled the automated extraction of growth, colour, and vegetation-index traits and demonstrated robust performance across independent phenotyping rounds. We screened 30 potato varieties under control, low-nutrient and salinity conditions, identifying contrasting growth and physiological responses. Integrated traits such as final area and height, Area_AUC and height_AUC, together with GLI, Chol, cive and chlorophyll fluorescence parameters, discriminated genotype performance under stress. Metabolic profiling further revealed genotype-specific reprogramming in carbon and nitrogen metabolism under low nutrition and salt stress, including changes in fructose, myo-inositol, β-aminobutyric acid, γ-aminobutyric acid, proline, and certain polyamines, identifying them as specific chemical biomarkers of plant stress responses. LOCOPOTS provides a scalable, affordable and space-efficient platform for early screening of potato genetic diversity and identification of candidate traits associated with stress resilience.
Mothukuri, S. R.; Massey-Reed, S. R.; Potgieter, A.; Laws, K.; Hunt, C.; Amuzu-Aweh, E. N.; Cooper, M.; Mace, E.; Jordan, D.
Lodging in sorghum presents a significant challenge for plant breeders due to the trade-off between lodging resistance and grain yield. Manually measuring lodging across thousands of plots is time-consuming, expensive, and error-prone, making selection for lodging resistance challenging in breeding programs. Unmanned Aerial Vehicle (UAV) derived metrics offer a potential high-throughput, cost-effective alternative for lodging phenotyping. This study developed a framework for predicting plot-level lodging from UAV imagery across 2,675 sorghum breeding plots. Multi-temporal canopy height data were collected at two critical time points: maximum crop height and at manual lodging assessment. Height percentiles were extracted from UAV derived point clouds generated using photogrammetric algorithms. These data were used to develop parametric, non-parametric, and ensemble prediction models, which were evaluated using three statistical metrics. The ensemble model, averaging predictions from all models, achieved the highest accuracy, with Pearson correlations of r = 0.80-0.84 and the lowest root mean square error (RMSE = 16-18), explaining 64-70% of variation in manual lodging counts. Model diagnostics and iterative refinement, including inspection of UAV imagery and dataset curation, had minimal impact on model performance, demonstrating the robustness of the approach. Model performance was consistent across sites, with minimal effects of stratified sampling on accuracy, confirming the ensemble approach as optimal for plot-level lodging assessment. This study demonstrates that integrated multi-temporal UAV imagery offers a practical alternative to labor-intensive manual evaluation methods by enabling high-throughput lodging assessment suitable for implementation in sorghum breeding programs.
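The ensemble in this abstract averages the predictions of several models and is scored with Pearson's r and RMSE. The sketch below shows that averaging step on made-up numbers, with two hypothetical models standing in for the parametric and non-parametric models. Squaring the reported correlations (0.80 and 0.84) gives 0.64 and about 0.71, which is close to the 64-70% of variation quoted.

```python
import numpy as np

def rmse(observed, predicted):
    """Root mean square error."""
    o = np.asarray(observed, dtype=float)
    p = np.asarray(predicted, dtype=float)
    return float(np.sqrt(np.mean((o - p) ** 2)))

def pearson_r(observed, predicted):
    """Pearson correlation coefficient."""
    return float(np.corrcoef(observed, predicted)[0, 1])

# Hypothetical manual lodging counts and two hypothetical model outputs
observed = np.array([10.0, 30.0, 50.0, 70.0])
model_a = np.array([12.0, 28.0, 55.0, 65.0])
model_b = np.array([8.0, 34.0, 45.0, 75.0])

# Ensemble prediction: simple mean across models
ensemble = np.mean(np.vstack([model_a, model_b]), axis=0)

for name, pred in (("a", model_a), ("b", model_b), ("ensemble", ensemble)):
    print(name, rmse(observed, pred), pearson_r(observed, pred) ** 2)
```

In this toy case the errors of the two models partly cancel, so the ensemble has the lowest RMSE. Real models need not behave this way.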
Zhou, S.; Zhao, T.
Genotype-by-environment interactions are central to crop adaptation and yield stability, yet they remain difficult to model for robust prediction across heterogeneous environments. Although enviromic profiling has improved the characterization of dynamic field conditions, most existing genomic prediction methods adopt a late-fusion strategy that encodes genomic and environmental information independently before global integration, thereby limiting their ability to resolve fine-scale, context-dependent G x E effects. Here, we developed GE-BiCross, a hierarchical bidirectional cross-attention framework for maize prediction. GE-BiCross incorporates a dual-path feature extraction module to disentangle independent and cooperative effects, a tokenized bidirectional cross-attention module to enable reciprocal genotype-environment interaction learning, and a mixture-of-experts module to adaptively capture heterogeneous response patterns across environments. Using a large-scale dataset of approximately 360,000 observations from 4,923 maize hybrids evaluated in 241 environments, GE-BiCross consistently outperformed conventional genomic prediction, machine learning, and deep learning baselines across six agronomic traits. The greatest improvements were observed for environmentally responsive and genetically complex traits. In particular, GE-BiCross achieved an R² of 0.672 for grain yield and 0.880 for grain moisture, significantly surpassing all comparison models. Ablation analyses demonstrated that the three core modules make distinct and complementary contributions to predictive performance. These results show that deep, bidirectional integration of genomic and enviromic information can substantially improve modeling of complex G x E interactions, providing a powerful framework for interpretable genomic prediction and climate-smart crop breeding.
Halpin-McCormick, A.; Nalla, M. K.; Radlicz, Z.; Zhang, A.; Fumia, N.; Lin, T.-h.; Lin, S.-w.; Wang, Y.-w.; Zohoungbogbo, H. P. F.; Wang, D. R.; Runck, B.; Gore, M. A.; Kantar, M. B.; Barchenger, D. W.
Climate change increasingly threatens global Capsicum (pepper) production. Accelerating the deployment of climate-resilient cultivars requires effective use of genetic diversity conserved in genebanks. We implement a "turbocharging" strategy in Capsicum by integrating genome-wide association studies and genomic prediction in a core collection (n = 423), followed by genomic prediction across the global collection (n = 10,250) using the core as a training population. We generated genomic estimated breeding values (GEBVs) for 31 high-accuracy traits (r > 0.5) encompassing hyperspectral phenotypes (heat/control), agronomic performance (heat/control) and fruit quality. To enhance accessibility and decision-making, we developed a large language model (LLM) integrated application that enables flexible, preference-based selection of candidates. By narrowing the parental decision space, this framework streamlines screening of large germplasm collections while balancing climate resilience, quality attributes and market demands. Our approach provides a scalable decision-support system to accelerate climate-resilient Capsicum breeding and maximize global genetic resources.
Bienvenu, C.; Roger, J.-M.; Sene, M.; Castro Pacheco, S. A.; Singer, M.; Felaniaina, B. L.; Terrier, N.; De Bellis, F.; Pot, D.; DE VERDAL, H.; Segura, V.
Phenomic prediction (PP) is a breeding value prediction method using near infrared spectroscopy (NIRS). Spectra pre-processing is a key step in the analysis pipeline of PP and generally involves chemometrics methods. However, there is still little understanding in the genetics community of what pre-processing does and why it improves performance. Consequently, the choice of pre-processing is made either arbitrarily or through a search for the optimal set of methods and associated parameters. In this study, we propose a PCA-based pre-processing method where genetic values of spectra are estimated on a set of principal components instead of individual wavelengths. This way, estimations are based on a few informative and orthogonal features of spectra instead of many correlated, uninformative wavelengths. We tested this new pre-processing method on five data sets representing four plant species (maize, rice, sorghum and grapevine). Results show that it performs as well as or better than the best classical chemometric pre-processing methods in almost all cases. Combining PCA-based and classical chemometric pre-processing methods maximizes predictive ability. Moreover, this pre-processing method opens up possibilities of better understanding and selecting parts of the spectral information that are relevant for the prediction of breeding values. Indeed, components representing together about 1% of spectral variability were found to be responsible for most of PP predictive ability. Plain language summary: Cultivated plants are the result of a breeding process during which their genetic values are used to select those to breed. Estimation of breeding values requires heavy experimental means and is time consuming. Phenomic prediction is a low cost and high throughput genetic value estimation method that is increasingly being used. 
It often uses near infrared spectroscopy measurements, which are easy to collect and thus routinely used in many species, as predictors of genetic values. However, near infrared spectra generally require pre-processing before being used in prediction. Currently used pre-processing methods arise from the chemometrics community, and deserve deeper understanding by geneticists. In this study, we propose a new pre-processing approach that performs as well as or better than the best chemometric pre-processing generally used, reduces computation time, and allows for a better understanding of what parts of spectral information are relevant for prediction.
Core Ideas:
- Working on principal components of spectra instead of wavelengths increases predictive ability of phenomic prediction and performs as well as or better than classical chemometrics pre-processing
- Working on principal components of spectra requires less optimization of parameters than chemometrics pre-processing
- About 1% of spectral variance is responsible for most of the predictive power of phenomic prediction
- Working on principal components of spectra pre-processed with classical chemometrics pre-processing can increase predictive ability even more
- PCA-based methods are valuable to optimize predictive ability of phenomic prediction and could be used more widely in the quantitative genetics field
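To make the idea of working on principal components rather than wavelengths concrete, the sketch below projects a spectra matrix onto its leading components with a plain SVD. It is only a generic PCA step under assumed inputs (random synthetic spectra, a hypothetical `pc_scores` helper). The estimation of genetic values on each component, which is the core of the method summarized here, is not shown.

```python
import numpy as np

def pc_scores(spectra, n_components):
    """Project spectra (samples x wavelengths) onto their leading principal components."""
    centered = spectra - spectra.mean(axis=0)          # center each wavelength
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    scores = u[:, :n_components] * s[:n_components]    # orthogonal component scores
    explained = s ** 2 / np.sum(s ** 2)                # share of spectral variance per component
    return scores, explained[:n_components]

# Synthetic, smooth-looking spectra: 20 samples x 200 wavelengths (illustrative only)
rng = np.random.default_rng(1)
spectra = rng.normal(size=(20, 200)).cumsum(axis=1)

scores, explained = pc_scores(spectra, n_components=5)
print(scores.shape, explained)
```

Each column of `scores` could then replace the individual wavelengths as the trait on which genetic values are estimated.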
Cerimele, G.; Kent, M.; Miller, M.; Best, R.; Franks, C.; Kakar, N.; Felderhoff, T.; Sexton-Bowser, S.; Morris, G. P.
Bioavailability of iron, an essential micronutrient to plants, is low in alkaline or calcareous soils, which are prevalent across semi-arid production regions. Breeding efforts to increase tolerance to iron deficiency chlorosis (IDC) in sorghum, a major crop of semi-arid regions, are confounded by spatial variation of stress severity in field trials. Here we developed and validated two high-throughput phenotyping approaches to address this challenge, with multi-spectral aerial imaging in the field and a controlled-environment assay to isolate the effects of iron bioavailability. In the field, severity and uniformity of stress are highly predictive of genetic signals for IDC tolerance (R² > 0.6 for soil pH metrics and H²). Plot-level data filtering for stress conditions based on control genotypes successfully addresses field spatial variation (unfiltered H² = 0.18 vs. filtered H² = 0.4). The controlled-environment assay proxies field stress using iron sources with differential bioavailability, evidenced by high heritability (H² = 0.98) and phenotypic differential for hybrid control genotypes that matches field performance. Finally, we show that assay phenotypes are suitable for genome-wide association studies in global germplasm. Together, these field and lab phenomic approaches can be deployed to understand genetics of IDC tolerance and develop crops resilient to alkaline soils. Highlight: Stress severity and uniformity greatly impact detection of genetic signals underlying iron deficiency chlorosis tolerance in sorghum. A controlled-environment assay reduces spatial heterogeneity and improves assessment of tolerance genetics.
Cazon, L. I.; Paredes, J. A.; Quiroga, M.; Guzman, F.
Potato common scab (Streptomyces sp.) is an economically important disease that reduces the quality and market value of tubers. A key aspect in developing management strategies involves accurately quantifying the disease. Due to the three-dimensional nature of the tuber and the heterogeneous distribution of lesions across its surface, visual estimates of severity can be challenging. Therefore, the objectives of this study were to develop and validate a standard area diagram (SAD) for estimating common scab severity on potato tubers and to compare validation outcomes obtained using real tubers and digital images. A SAD comprising six severity levels (from 1.3 to 66.8%) was developed based on image analysis of naturally infected tubers. Validation was conducted using two complementary approaches in which inexperienced raters evaluated either real potato tubers or digital images of the same tubers under unaided and aided conditions. Accuracy, bias components, and inter-rater reliability were quantified using absolute error metrics, Lin's concordance correlation coefficient, intraclass correlation coefficients, and overall concordance correlation coefficients. Use of the SAD significantly improved accuracy, reduced systematic bias, and increased inter-rater reliability across both validation approaches. No significant differences were detected between assessments conducted on real tubers and images, although image-based evaluations showed a slight, non-significant tendency toward reduced scale and location bias under aided conditions. These results demonstrate that a dimension-aware SAD integrating information across the full tuber surface enhances the reliability and reproducibility of visual severity assessments and supports the use of image-based evaluations for training, large-scale surveys, and remote or collaborative applications involving three-dimensional plant organs.
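Lin's concordance correlation coefficient, used in this abstract to assess rater accuracy, penalizes both scatter and systematic bias relative to the 1:1 line. The following is a minimal sketch of the standard formula; the severity values are invented, and the example rater overestimates every tuber by exactly one percentage point.

```python
import numpy as np

def lins_ccc(actual, estimated):
    """Lin's concordance correlation coefficient (population-variance form)."""
    x = np.asarray(actual, dtype=float)
    y = np.asarray(estimated, dtype=float)
    covariance = np.mean((x - x.mean()) * (y - y.mean()))
    # The squared mean difference in the denominator is the location-bias penalty
    return float(2.0 * covariance / (x.var() + y.var() + (x.mean() - y.mean()) ** 2))

actual = np.array([1.0, 2.0, 3.0, 4.0, 5.0])   # hypothetical true severity (%)
biased_rater = actual + 1.0                    # constant overestimate

print(np.corrcoef(actual, biased_rater)[0, 1]) # Pearson r ignores the constant offset
print(lins_ccc(actual, biased_rater))          # CCC is reduced by the offset
```

The contrast between the two printed values shows why a concordance measure, rather than correlation alone, is used when validating a standard area diagram.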
Konrai, K.; Ito, R.; Sunayama, S.; Omura, K.; Isagi, Y.; Kitajima, K.; Onoda, Y.
Premise: Elliptic Fourier analysis is widely used to quantify leaf shape variation, but inconsistent normalization and orientation alignment can introduce biologically irrelevant variation. In addition, a reproducible workflow from raw images to normalized elliptic Fourier descriptors (EFDs) is still lacking. Methods and Results: We developed LeafContourEFD, a GUI application for reproducible leaf morphometrics. It integrates image segmentation, contour extraction, EFD calculation, and an extended normalization framework, termed oriented true EFD normalization, based on a user-defined biological reference axis. Analyses of Quercus serrata, Q. crispula, and Triadica sebifera showed that existing normalization methods can introduce orientation-related variance when the first-harmonic major axis does not match the leaf base-to-tip axis. In contrast, oriented true normalization removed these artifacts, yielding clearer shape transitions along principal components and allowing shape variation among leaves to be captured while preserving biologically meaningful lateral asymmetry. Conclusions: LeafContourEFD improves interpretability and reproducibility in outline-based morphometrics and provides transparent outputs and metadata for data sharing and cross-study comparisons.
Yadav, V.; Mishra, D. S.; Rane, J.; Apparao, V. V.; Dembure, L.; Ravat, P.; Abadura, N. A.; Kumar, P.; Anokye, B.; sahild, A.; Devi, P.; Amoah, P.
This study integrated morphometric characterization and machine-learning modelling to identify key predictors of yield in Annona reticulata under semi-arid conditions. Thirty-one canopy, fruit, seed, and biochemical traits were evaluated across 62 genotypes, revealing substantial phenotypic diversity, particularly in structural attributes such as tree growth nature and branch angle. Principal Component Analysis and hierarchical clustering differentiated genotypes into three ideotypes representing high-yielding, structurally stable, and quality-oriented groups. Random Forest modelling and SHapley Additive exPlanations (SHAP) interpretation consistently highlighted leaf breadth, leaf length, fruit shape, and pulp-associated traits as dominant yield predictors, underscoring the coordinated influence of source-sink balance. Integration of SHAP importances with trait stability (CV%) further revealed that moderately variable traits provide reliable selection indices. These findings demonstrate that yield performance is governed by multivariate trait networks rather than isolated descriptors. The proposed framework provides a robust basis for precision phenotyping and strategic parent selection to develop high-yielding, nutritionally enriched, and climate-resilient custard apple cultivars.
Ayub, Y.; McGuire-Scullen, S.; Percival, S.; Weaver, W. N.; Karki, N.; Yahiaoui, W.; Astudillo-Pavon, K.; Barrios, A.; Check, J. C.; Colchado-Lopez, J.; Dolgikh, B. A.; Espinosa-Martinez, D. V.; Fu, Q.; Galvan-Lara, K. M.; Garcia-Chavez, J. N.; Garcia-Rios, S.; Grabb, C. N.; Guadir-Lara, G. E.; Hawkins, J. C.; Hendrickson, C. L.; Hightower, A. T.; Hurtado-Olvera, J. J.; Kianian, S.; Lennon, J.; Li, Z.; Li, J.; Lieb, B.; Lin, J.; Lopez-Sanchez, P.; Luna-Alvarez, M.; Martinez-Martinez, C.; Montemayor-Lara, a.; Moreno, N. A.; Obisesan, I. A.; Perez-Flores, O.; Pimentel-Ruiz, C.; Pineda-Hernandez,
(1) Rationale: Quantifying and predicting plant morphology is central to understanding development and evolution, yet many plant forms lack homologous features required for traditional morphometrics. We apply the Euler Characteristic Transform (ECT), an injective descriptor from topological data analysis, to encode 2D plant shapes. The ECT converts contours into image-like representations that preserve shape information while enabling deep learning. (2) Methods: We computed ECTs for large datasets of leaf and pavement cell shapes and used convolutional neural networks (CNNs) for classification. We also trained CNNs to approximate the inverse mapping, predicting leaf shape masks from radial ECTs. (3) Key results: ECT-based models achieved high classification accuracy, surpassing previous approaches on millions of herbarium-derived leaves. Notably, grapevine leaf venation was predicted from blade geometry alone, demonstrating that vascular structure is encoded in the outline. (4) Main conclusion: The ECT provides a compact, information-preserving representation of biological shape that integrates naturally with deep learning. It enables both accurate classification and predictive reconstruction, revealing latent morphological information and offering new opportunities to study plant form across scales.
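To illustrate how a contour can become an image-like array, the following is a minimal sketch of a directional ECT for a closed 2D polygonal contour. It is not the authors' implementation: the function name `ect_closed_contour`, the centring and scaling step, the numbers of directions and thresholds, and the small tolerance `eps` are all assumptions made for this example. For each direction, vertices are ranked by their height along that direction, and the Euler characteristic (vertices minus edges) of the part of the contour below each threshold is recorded.

```python
import numpy as np

def ect_closed_contour(points, n_directions=32, n_thresholds=33, eps=1e-9):
    """Directional Euler Characteristic Transform of a closed contour.

    points: (n, 2) array of ordered vertices; edge i joins vertex i
    and vertex (i + 1) % n. Returns an integer array of shape
    (n_directions, n_thresholds). All parameters are illustrative.
    """
    pts = np.asarray(points, dtype=float)
    # Centre and scale so every vertex lies within the unit disc (assumed normalisation).
    pts = pts - pts.mean(axis=0)
    radius = np.linalg.norm(pts, axis=1).max()
    if radius > 0:
        pts = pts / radius

    angles = np.linspace(0.0, 2.0 * np.pi, n_directions, endpoint=False)
    directions = np.stack([np.cos(angles), np.sin(angles)], axis=1)
    thresholds = np.linspace(-1.0, 1.0, n_thresholds)

    # Height of each vertex along each direction: shape (directions, vertices).
    vertex_h = directions @ pts.T
    # An edge enters the sublevel set once both of its endpoints have entered.
    edge_h = np.maximum(vertex_h, np.roll(vertex_h, -1, axis=1))

    cutoff = thresholds[None, None, :] + eps
    n_vertices = (vertex_h[:, :, None] <= cutoff).sum(axis=1)
    n_edges = (edge_h[:, :, None] <= cutoff).sum(axis=1)
    # Euler characteristic of a graph: vertices minus edges.
    return n_vertices - n_edges

# Usage: a unit square contour yields a 32 x 33 integer "image".
square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
ect_image = ect_closed_contour(square)
```

The resulting array can be treated like a single-channel image, which is the property that makes such a representation convenient as CNN input. The radial ECT mentioned in the abstract may arrange directions and thresholds differently from this rectangular layout.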
Youssef, A.; Badreldin, N.
The Digital Pedon (DP) is an open-source Python framework that represents a soil profile as a continuously updated digital twin, bridging three persistent gaps in soil science: disconnected models and observations, cross-database interoperability, and the inference gap between raw sensor signals and agronomically meaningful variables. Integrating real-time sensor streams, model-based solver chains (Model-Zoo), GLOSIS-compliant ontology mapping, and a novel LLM agentic interface layer enabling natural language soil queries, the DP supports applications spanning precision agriculture, digital soil mapping, and environmental sustainability assessment. Four proof-of-concept experiments confirm automatic profile initialisation fidelity, solver chain consistency, ontology compliance, and user-defined solver extensibility.
Mehrem, S. L.; Zijl, A.; de Haan, M.; Van den Ackerveken, G.; Snoek, B. L.
Lettuce (Lactuca sativa) is an important field crop, but our understanding of its phenotypic variation and underlying genetics under natural field conditions remains limited, posing challenges for identifying effective crop breeding targets. Longitudinal hyperspectral phenotyping allows for non-invasive monitoring of crop performance under diverse agricultural conditions. In this study, we used hyperspectral imaging to assess the phenotypic variation of almost 200 different field-grown lettuce varieties, following the same plants from just after the seedling stage to the flowering stage. With automated image processing, we extracted a wide range of spectral phenotypes related to metabolite content, growth efficiency, and environmental stress responses, creating a multi-dimensional time-resolved data set. Principal component analysis (PCA) revealed the major axes of spectral variation over time and highlighted differences in spectral patterns among lettuce genotypes. Integrating on-site weather data, we modelled GxE interactions of reflectance, revealing regions of the lettuce vegetation spectrum that are primarily shaped by genotype and/or environment. We estimated phenotypic plasticity in response to time, temperature and rainfall using best linear unbiased predictions (BLUPs), capturing genotype-specific developmental trajectories and responses to the environment. We used genome-wide association studies (GWAS) to identify quantitative trait loci (QTLs) of PC-based, single and BLUP-based phenotypes, disentangling the genetic architecture of spectral lettuce phenotypes from major axes of variation down to single-wavelength spectral plasticity. These findings provide new insights into the genome-wide genetic regulation and dynamics of spectral phenotypes in field-grown lettuce.